Identifying regional dialects in online social media

Author

  • Jacob Eisenstein
Abstract

Electronic social media offers new opportunities for informal communication in written language while also providing new datasets that allow researchers to document dialect variation from records of natural communication among millions of individuals. The unprecedented scale of this data enables the application of quantitative methods to automatically discover the lexical variables that distinguish the language of geographical areas such as cities. This can be paired with the segmentation of geographical space into dialect regions within a single joint statistical model, thus simultaneously identifying coherent dialect regions and the words that distinguish them. Finally, a diachronic analysis reveals rapid changes in the geographical distribution of these lexical features, suggesting that statistical analysis of social media may offer new insights into the diffusion of lexical change.
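
As a rough illustration of what "discovering the lexical variables that distinguish geographical areas" can mean in practice, the sketch below ranks words per region by a smoothed log-odds ratio against all other regions. This is not the joint statistical model described in the abstract; it is a much simpler stand-in technique. The function name `distinctive_words`, the smoothing constant `alpha`, the region labels, and the toy messages are all invented for illustration.

```python
import math
from collections import Counter


def distinctive_words(region_docs, top_k=3, alpha=0.5):
    """Rank words per region by smoothed log-odds versus all other regions.

    region_docs maps a region label to a list of message strings.
    alpha is an additive smoothing constant (an assumed value, not from the paper).
    """
    # Per-region word counts, using naive lowercase whitespace tokenization.
    counts = {
        region: Counter(w for doc in docs for w in doc.lower().split())
        for region, docs in region_docs.items()
    }
    vocab = set().union(*counts.values())
    total = Counter()
    for c in counts.values():
        total.update(c)

    grand_total = sum(total.values())
    ranked = {}
    for region, c in counts.items():
        n_in = sum(c.values())
        n_out = grand_total - n_in
        scores = {}
        for w in vocab:
            # Smoothed probability of w inside the region and outside it.
            p_in = (c[w] + alpha) / (n_in + alpha * len(vocab))
            p_out = (total[w] - c[w] + alpha) / (n_out + alpha * len(vocab))
            scores[w] = math.log(p_in / p_out)
        # Highest log-odds first; ties broken alphabetically for determinism.
        ranked[region] = sorted(scores, key=lambda w: (-scores[w], w))[:top_k]
    return ranked


# Toy, invented messages; real input would be geotagged social media text.
toy = {
    "region_a": ["hella good food", "that show was hella fun", "good food"],
    "region_b": ["wicked good food", "that show was wicked fun", "good food"],
}
top = distinctive_words(toy, top_k=1)
print(top)
```

On the toy input, the only word whose frequency differs between the two regions ranks first for each. A joint model of the kind the abstract describes would additionally infer the region boundaries themselves rather than taking them as given.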

Similar resources

Transliteration of Arabizi into Arabic Orthography: Developing a Parallel Annotated Arabizi-Arabic Script SMS/Chat Corpus

This paper describes the process of creating a novel resource, a parallel Arabizi-Arabic script corpus of SMS/Chat data. The language used in social media expresses many differences from other written genres: its vocabulary is informal with intentional deviations from standard orthography such as repeated letters for emphasis; typos and nonstandard abbreviations are common; and nonlinguistic co...

Identifying Customer Journey Opportunities in 5A Model in Tourism Industry

Purpose: The growing development of technologies has empowered customers and improved their relationships with companies. Consequently, marketers should pursue new ways and pathways for attracting customers. The modern customer buying path in the age of communication has been redesigned as the 5A model (Aware, Appeal, Ask, Act, Advocate). The purpose is to identify customer opportunities of 5A model in tourism ind...

Considering the Future of Pharmaceutical Promotions in Social Media; Comment on “Trouble Spots in Online Direct-to-Consumer Prescription Drug Promotion: A Content Analysis of FDA Warning Letters”

This commentary explores the implications of increased social media marketing by drug manufacturers, based on findings in Hyosun Kim’s article of the major themes in recent Food and Drug Administration (FDA) warning letters and notices of violation regarding online direct-to-consumer promotions of pharmaceuticals. Kim’s rigorous analysis of FDA letters over a 10-year span highlights a relative ...

Freshman or Fresher? Quantifying the Geographic Variation of Language in Online Social Media

In this paper we present a new computational technique to detect and analyze statistically significant geographic variation in language. While previous approaches have primarily focused on lexical variation between regions, our method identifies words that demonstrate semantic and syntactic variation as well. Our meta-analysis approach captures statistical properties of word usage across geogra...

Social Campaigns on Online Platforms as a New Form of Public Sphere in Digital Era: A Critical Review

Nowadays with the ever-increasing growth in social media platforms and the creation of different forms of online activism, the word known as “Campaign” has become a familiar and useful term in people’s everyday lives. Campaigns with all kinds of social aims especially using Hashtags are run on social media platforms by individuals, charities, NGOs, governments, municipalities and brand companie...

Publication date: 2014